Search CORE

114 research outputs found

Discriminant Projection Representation-based Classification for Vision Recognition

Author: Feng Qingxiang
Zhou Yicong
Publication venue
Publication date: 19/11/2017
Field of study

Representation-based classification methods such as sparse representation-based classification (SRC) and linear regression classification (LRC) have attracted a lot of attentions. In order to obtain the better representation, a novel method called projection representation-based classification (PRC) is proposed for image recognition in this paper. PRC is based on a new mathematical model. This model denotes that the 'ideal projection' of a sample point

x

on the hyper-space

H

may be gained by iteratively computing the projection of

x

on a line of hyper-space

H

with the proper strategy. Therefore, PRC is able to iteratively approximate the 'ideal representation' of each subject for classification. Moreover, the discriminant PRC (DPRC) is further proposed, which obtains the discriminant information by maximizing the ratio of the between-class reconstruction error over the within-class reconstruction error. Experimental results on five typical databases show that the proposed PRC and DPRC are effective and outperform other state-of-the-art methods on several vision recognition tasks.Comment: Accepted by the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models

Author: Hong Yicong
Wu Qi
Zhou Gengze
Publication venue
Publication date: 29/05/2023
Field of study

Trained with an unprecedented scale of data, large language models (LLMs) like ChatGPT and GPT-4 exhibit the emergence of significant reasoning abilities from model scaling. Such a trend underscored the potential of training LLMs with unlimited language data, advancing the development of a universal embodied agent. In this work, we introduce the NavGPT, a purely LLM-based instruction-following navigation agent, to reveal the reasoning capability of GPT models in complex embodied scenes by performing zero-shot sequential action prediction for vision-and-language navigation (VLN). At each step, NavGPT takes the textual descriptions of visual observations, navigation history, and future explorable directions as inputs to reason the agent's current status, and makes the decision to approach the target. Through comprehensive experiments, we demonstrate NavGPT can explicitly perform high-level planning for navigation, including decomposing instruction into sub-goal, integrating commonsense knowledge relevant to navigation task resolution, identifying landmarks from observed scenes, tracking navigation progress, and adapting to exceptions with plan adjustment. Furthermore, we show that LLMs is capable of generating high-quality navigational instructions from observations and actions along a path, as well as drawing accurate top-down metric trajectory given the agent's navigation history. Despite the performance of using NavGPT to zero-shot R2R tasks still falling short of trained models, we suggest adapting multi-modality inputs for LLMs to use as visual navigation agents and applying the explicit reasoning of LLMs to benefit learning-based models

arXiv.org e-Print Archive

Genetic learning particle swarm optimization

Author: Chung Henry Shu-Hung
Gong Yue-Jiao
Li Jing-Jing
Li Yun
Shi Yu-Hui
Zhang Jun
Zhou Yicong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/2016
Field of study

Social learning in particle swarm optimization (PSO) helps collective efficiency, whereas individual reproduction in genetic algorithm (GA) facilitates global effectiveness. This observation recently leads to hybridizing PSO with GA for performance enhancement. However, existing work uses a mechanistic parallel superposition and research has shown that construction of superior exemplars in PSO is more effective. Hence, this paper first develops a new framework so as to organically hybridize PSO with another optimization technique for “learning.” This leads to a generalized “learning PSO” paradigm, the *L-PSO. The paradigm is composed of two cascading layers, the first for exemplar generation and the second for particle updates as per a normal PSO algorithm. Using genetic evolution to breed promising exemplars for PSO, a specific novel *L-PSO algorithm is proposed in the paper, termed genetic learning PSO (GL-PSO). In particular, genetic operators are used to generate exemplars from which particles learn and, in turn, historical search information of particles provides guidance to the evolution of the exemplars. By performing crossover, mutation, and selection on the historical information of particles, the constructed exemplars are not only well diversified, but also high qualified. Under such guidance, the global search ability and search efficiency of PSO are both enhanced. The proposed GL-PSO is tested on 42 benchmark functions widely adopted in the literature. Experimental results verify the effectiveness, efficiency, robustness, and scalability of the GL-PSO

Crossref

University of Strathclyde Institutional Repository

Enlighten